Skip to content

ci(compat): trigger Backend-Compat-Test on real graph_storage path#1926

Merged
earayu merged 1 commit into
mainfrom
chenyexuan/task-61-compat-workflow-paths-fix
Apr 30, 2026
Merged

ci(compat): trigger Backend-Compat-Test on real graph_storage path#1926
earayu merged 1 commit into
mainfrom
chenyexuan/task-61-compat-workflow-paths-fix

Conversation

@earayu
Copy link
Copy Markdown
Collaborator

@earayu earayu commented Apr 30, 2026

Summary

`compat-test.yml` `pull_request` paths filter watches `aperag/domains/knowledge_graph/graphindex/` and `aperag/vectorstore/`, but the actual graph store backends were moved to `aperag/indexing/graph_storage/{neo4j,nebula,postgres}.py`. Any PR touching those files therefore bypassed Backend-Compat-Test — exactly the workflow that exists to catch cross-adapter regressions.

This is task #61 P0 surfaced by chenyexuan workflow-gate audit (msg=f298011e).

Why this matters

冬柏 (msg=f6416178) flagged `GET /graphs/labels` Neo4j 500 vs Nebula/PG pass as the canonical P0 candidate for task #61. That regression should have been caught by Backend-Compat-Test, but the workflow never ran for the relevant PRs because of this stale filter. Today's edits to `aperag/indexing/graph_storage/neo4j.py` (mtime 2026-04-30) similarly skipped the gate.

Change

Add `aperag/indexing/graph_storage/` to the paths filter. Keep the legacy `aperag/domains/knowledge_graph/graphindex/` entry as a defensive fallback for any follow-up commits that still touch the old tree (low cost, removes a footgun).

Test plan

  • Pure CI configuration change; no test logic, no production code.
  • git diff --check pass.
  • Architect msg=edbdd2b4 standing ratify (small fix, meta-quality issue, 0 risk).
  • Sediment candidate Lesson feat: chat #16 / Lesson fix: poetry dependencies #15 v2 (workflow path drift bypass) noted; huangheng follow-up sub-PR will fold.

CR

🤖 Generated with Claude Code

The pull_request paths filter on compat-test.yml only watched
``aperag/domains/knowledge_graph/graphindex/**`` and
``aperag/vectorstore/**``, but the actual graph store backends were
moved to ``aperag/indexing/graph_storage/{neo4j,nebula,postgres}.py``.
Any PR touching those files therefore bypassed Backend-Compat-Test —
exactly the workflow that exists to catch cross-adapter regressions
(e.g. the ``GET /graphs/labels`` 500 on Neo4j vs. pass on Nebula/PG
that 冬柏 surfaced as a P0 candidate, or today's neo4j.py edits).

Add ``aperag/indexing/graph_storage/**`` to the trigger list. Keep
the legacy ``aperag/domains/knowledge_graph/graphindex/**`` entry as
a defensive fallback for any follow-up commits that still touch the
old tree.

Pure CI configuration change; no test logic, no production code.
This is task #61 P0 surfaced by chenyexuan's workflow-gate audit
slice (cite trail in #indexing优化 msg=f298011e).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
Weston msg=13dd5e91 BLOCKER (score normalization severity drift):
保持 P0-V3+V4 P0 across §1.1 / §2.2 / §5.3 — score 方向是 caller
语义硬契约,不能在 PGVector/Qdrant 间显示反向。§2.2 加 P0-V3+V4
显式行 + §5.3 加 test_score_normalization_in_vector.py boundary
test (跨 metric × 跨 adapter 全 6 cell parametrize).

Streaming integrations (5 lane):

1. Bryce msg=23a2f514 P0-V1 first-principles 重新定性 — Qdrant
   legacy mode tenant isolation 是 collection name level 不是 query
   filter level (verify qdrant_connector.py:442-446),下沉 P1-V4
   defense-in-depth (legacy mode deprecation follow-up 候选).

2. Bryce msg=8e895471 11 vector findings — 4 P0 (cross-tenant
   下沉 / filter silent / score V3+V4) + 3 P1 (collection init /
   batch atomicity / filter Or 语义) + 4 P2.

3. dongdong msg=4201465a + PR #1929 + cuiwenbo msg=bcec38ad —
   P0-D1 Helm worker Neo4j env missing (Singapore graph viz
   root-cause); P1-D1 e2e shape matrix gap; P1-D2 Nebula no Helm
   first-class; P1-D3 typed schema 缺 vector backend exposure.

4. chenyexuan NIT — Lesson #16 candidate cite added §6.

5. Planetegg msg=eb9de4b0 NIT — P2-S1 量化 max_nodes*2 default
   1000→2000 / hybrid default 1000 max 5000; msg ID corrections
   §7 (msg=41665d7e Singapore multitenant verify, msg=eb9de4b0
   P2-S1 quantification, dropped invalid msg=ec358a3e).

冬柏 PR #1927 commit b2234ae fold-in §5.3 (38 cases incl
zero-side-effect + replay idempotency post-NIT).

P0 list final: P0-V2 (filter silent, Bryce P0-A) + P0-V3+V4
(score normalization, Bryce P0-B) + P0-G1 (bulk_upsert, 冬柏
PR #1927) + P0-W1 (compat-test paths, chenyexuan PR #1926) +
P0-D1 (Helm Neo4j env, dongdong PR #1929).
@earayu earayu merged commit a0403cf into main Apr 30, 2026
17 of 18 checks passed
@earayu earayu deleted the chenyexuan/task-61-compat-workflow-paths-fix branch April 30, 2026 05:00
earayu added a commit that referenced this pull request Apr 30, 2026
…ss-backend

PM @不穷 elevated this Protocol method as a P0 audit gap (msg=10b753e8).
Until now ``bulk_upsert_entity_with_lineage_parts`` (Wave 8 W8-2) had
no cross-backend test in `tests/integration/compat/`, even though all
three production backends (Postgres / Neo4j / Nebula) implement it
and the indexing worker uses it for the LineageEntityMerger merge step.

Bulk write paths are exactly where backend differences emerge — batch
size limits, transaction atomicity, error handling, dedup contract —
and the lack of a parametrized matrix here meant any silent drift in
the bulk semantics would survive merge.

This adds 7 new parametrized cases that pin the Protocol contract
declared in `aperag/indexing/graph.py:575+`:

* empty parts is a no-op (no implicit row creation)
* mixed-name parts raise ValueError (atomicity guarantee)
* round-trip: 3 distinct (document_id, parse_version) parts visible after
* dedup last-wins within a single bulk call
* bulk replaces existing rows on matching key (same as single upsert)
* bulk with distinct keys appends, never wipes pre-existing lineage
* per-part entity_type follows last-wins rule

Coverage delta: 30 → 37 cross-backend cases (collect-only verified).

Sister to chenyexuan PR #1926 — without that workflow path fix, this
test never triggered on PRs that touch `aperag/indexing/graph_storage/*`.
Both PRs together restore real CI gating on cross-backend regressions
for the LineageGraphStore Protocol surface.

Part of task #61 DB compat audit (earayu2 directive msg=f26b703e),
testing-lane slice (task #67, claimed via msg=e02c3028).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…audit (#1928)

* docs(task-61): DB adapter compat spec v1 — vector + graph cross-impl audit

Architect spec v1 起草 per earayu2 directive (msg=8b989470 / msg=2bad8e75
/ msg=f26b703e) + PM 不穷 task #72 dispatch.

Streaming evidence integration from 8 lanes:
- huangheng msg=ed2f2973: 3 vector P0 candidates (cross-tenant /
  filter silent / collection init)
- Bryce msg=8e895471 task #69: 11 vector findings (4 P0 + 3 P1 + 4 P2,
  including upgraded score normalization P0-V3/V4)
- 冬柏 msg=3e93bb64 task #67: 3 missing Protocol method tests
  (bulk_upsert_entity_with_lineage_parts P0 + remove_relation_lineage
  P1 + list_entities P1)
- chenyexuan msg=f298011e + PR #1926: workflow paths filter dead
  reference P0-W1 (in flight)
- cuiwenbo msg=dfebf706 task #70: FE/UX 3 candidates (score, viz error
  vs empty, confidence_score)
- Planetegg msg=db7fb085 + msg=41906f4 + msg=41665d7e task #65: alias
  resolution gather P2-S1 + Singapore QDRANT_MULTITENANT=True (no
  hot-fix needed) + env shape verify
- ziang task #64 graph store audit (in_progress, will fold-in)
- dongdong task #71 deploy/typed schema (in_progress, will fold-in)

Spec structure:
- §1 inventory by lane with file:line evidence
- §2 缺口 by severity (P0 CRITICAL hot-fix candidate / P0 必修 / P1
  允许差异 declare / P2 性能优化 / YAGNI)
- §3 三层 design direction per Weston msg=85e527e3 framework
- §4 sub-task dispatch (Phase A 8 lane parallel + Phase B per-P0
  three-PR-pattern + Phase C P2 + Phase D PR #1926 unblock)
- §5 acceptance: P0/P1 standards + boundary test gate + e2e + sample
  limitation免责
- §6 CR mandatory checklist citing Lesson #11-#16 family from
  PR #1916/#1924/#1922 sediment + new Lesson #16 candidate (workflow
  paths dead reference)

Sample limitation: spec evidence from streaming surface, not
huangzhangshu collected gap list — fix-forward amend after
huangzhangshu lane completes + Bryce/ziang audit slice输出.

Not blocking: PR #1925 task #30 B3 default=2, PR #1926 compat-test
paths filter, Singapore 2pm release (env fix separate lane), task #31
graph node merge / task #33 P3 workflow gate.

* docs(task-61): fix-forward Weston BLOCKER + 5 streaming integration

Weston msg=13dd5e91 BLOCKER (score normalization severity drift):
保持 P0-V3+V4 P0 across §1.1 / §2.2 / §5.3 — score 方向是 caller
语义硬契约,不能在 PGVector/Qdrant 间显示反向。§2.2 加 P0-V3+V4
显式行 + §5.3 加 test_score_normalization_in_vector.py boundary
test (跨 metric × 跨 adapter 全 6 cell parametrize).

Streaming integrations (5 lane):

1. Bryce msg=23a2f514 P0-V1 first-principles 重新定性 — Qdrant
   legacy mode tenant isolation 是 collection name level 不是 query
   filter level (verify qdrant_connector.py:442-446),下沉 P1-V4
   defense-in-depth (legacy mode deprecation follow-up 候选).

2. Bryce msg=8e895471 11 vector findings — 4 P0 (cross-tenant
   下沉 / filter silent / score V3+V4) + 3 P1 (collection init /
   batch atomicity / filter Or 语义) + 4 P2.

3. dongdong msg=4201465a + PR #1929 + cuiwenbo msg=bcec38ad —
   P0-D1 Helm worker Neo4j env missing (Singapore graph viz
   root-cause); P1-D1 e2e shape matrix gap; P1-D2 Nebula no Helm
   first-class; P1-D3 typed schema 缺 vector backend exposure.

4. chenyexuan NIT — Lesson #16 candidate cite added §6.

5. Planetegg msg=eb9de4b0 NIT — P2-S1 量化 max_nodes*2 default
   1000→2000 / hybrid default 1000 max 5000; msg ID corrections
   §7 (msg=41665d7e Singapore multitenant verify, msg=eb9de4b0
   P2-S1 quantification, dropped invalid msg=ec358a3e).

冬柏 PR #1927 commit b2234ae fold-in §5.3 (38 cases incl
zero-side-effect + replay idempotency post-NIT).

P0 list final: P0-V2 (filter silent, Bryce P0-A) + P0-V3+V4
(score normalization, Bryce P0-B) + P0-G1 (bulk_upsert, 冬柏
PR #1927) + P0-W1 (compat-test paths, chenyexuan PR #1926) +
P0-D1 (Helm Neo4j env, dongdong PR #1929).

* docs(task-61): § 3.1.1 historical residue cleanup per Weston msg=fdf04a69 NIT — strike old P0 hot-fix path (P0-V1 已下沉 P1-V4 per Bryce first-principles verify)

* docs(task-61): final consistency cleanup per Weston msg=e414d3cf — line 14 count 4+3+4 to 3 P0 + 4 P1 + 4 P2; § 5.1 P0-V1 line removed; § 5.2 P1-V4 defense-in-depth boundary test added
earayu added a commit that referenced this pull request Apr 30, 2026
…ss-backend (#1927)

* test(compat): task #61 P1 — bulk_upsert_entity_with_lineage_parts cross-backend

PM @不穷 elevated this Protocol method as a P0 audit gap (msg=10b753e8).
Until now ``bulk_upsert_entity_with_lineage_parts`` (Wave 8 W8-2) had
no cross-backend test in `tests/integration/compat/`, even though all
three production backends (Postgres / Neo4j / Nebula) implement it
and the indexing worker uses it for the LineageEntityMerger merge step.

Bulk write paths are exactly where backend differences emerge — batch
size limits, transaction atomicity, error handling, dedup contract —
and the lack of a parametrized matrix here meant any silent drift in
the bulk semantics would survive merge.

This adds 7 new parametrized cases that pin the Protocol contract
declared in `aperag/indexing/graph.py:575+`:

* empty parts is a no-op (no implicit row creation)
* mixed-name parts raise ValueError (atomicity guarantee)
* round-trip: 3 distinct (document_id, parse_version) parts visible after
* dedup last-wins within a single bulk call
* bulk replaces existing rows on matching key (same as single upsert)
* bulk with distinct keys appends, never wipes pre-existing lineage
* per-part entity_type follows last-wins rule

Coverage delta: 30 → 37 cross-backend cases (collect-only verified).

Sister to chenyexuan PR #1926 — without that workflow path fix, this
test never triggered on PRs that touch `aperag/indexing/graph_storage/*`.
Both PRs together restore real CI gating on cross-backend regressions
for the LineageGraphStore Protocol surface.

Part of task #61 DB compat audit (earayu2 directive msg=f26b703e),
testing-lane slice (task #67, claimed via msg=e02c3028).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(compat): task #61 P1 — fold huangheng+ziang NIT into bulk_upsert tests

Two non-blocking NITs from @huangheng msg=99b5ffd5 + @ziang msg=84f5c3cc
re-CR on PR #1927 — fold-in to land more complete test:

* `_rejects_mixed_names` now also asserts post-raise zero-side-effect
  (`get_entity("Alice") is None` + `get_entity("Bob") is None`) — pins
  Lesson #12 v6.4 aggregation-chain invariant: a backend that swapped
  validation order to raise AFTER the first row write would silently
  leak partial state.

* New `_replay_is_idempotent` case — pins the Protocol's "Forward-only
  retry safety: per-part dedup so replays are idempotent" contract.
  A backend that appended on replay (instead of dedup-then-replace)
  would silently duplicate lineage members under retry.

Coverage delta: 37 → 38 cross-backend cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(compat): task #61 P1 — fold huangzhangshu description_parts NIT

Per @huangzhangshu testing primary CR (msg=5bbc5d1a) — the bulk_upsert
cases pinned lineage member identity but did not assert
``description_parts`` text content. A backend could write the lineage
member key correctly but silently drop or stale-keep the description
text, breaking the agent context retrieval contract.

Add `description_parts` key→text assertions to 3 cases:

* `_round_trip` — all 3 (doc_id, parse_version) parts must carry their
  source bulk's description text (not silently dropped).
* `_dedup_last_wins_within_bulk` — same-key collapse must keep the
  LAST description text within the bulk (not first).
* `_replaces_existing_same_key` — bulk's strip-then-append must
  overwrite the prior single-write description (not silently keep it).
* `_replay_is_idempotent` — replay must overwrite first call's
  description with the second's (last-wins on replay), not just dedup
  the member.

Coverage delta: same 38 cases, but every dedup/replace/replay case now
pins both lineage AND description_parts text contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
earayu added a commit that referenced this pull request Apr 30, 2026
…#1932)

§ 四 加 8 lesson sediment(task #30 B3 + task #61 全 P0 闭环累计实证)+ § 六 sediment 引用追加 6 PR commit cross-link + § 八 修订记录追加本 PR fold trail。

新增 lesson:
- Lesson #12 v7.4: external API raw contract verify (task #61 P0-B PR #1930
  Qdrant euclid raw direction first-application + fix-forward 1e30a00)
- Lesson #12 v8 second-application: test docstring fake guardrail (task #61
  P0-G1 PR #1927 description_parts assertion 缺位 fix-forward 1953933)
- Lesson #12 v9: first-principles verify catch surface signal mistakes
  (task #61 P0-V1 重新定性 Bryce + task #61 P0-B Qdrant euclid Weston catch
  双独立 source 同源 first/second-application)
- Lesson #13 v2.3: deploy manifest dual-side rewrite (task #61 P0-D1 PR #1929
  Helm Neo4j worker env first-application)
- Lesson #13 v3 application demo 2: cross-source default value alignment
  (task #30 B3 PR #1925 commit dae43f5 三 source 同步 first-application)
- Lesson #14 application demo: spec 内部 default 漂浮 multi-iteration cleanup
  (task #30 B3 PR #1925 fix-forward dae43f5 § 3.1.1 line 85 cleanup
  second-application demo, first-application 在 task #35 6 轮 fix-forward)
- Lesson #16: CI workflow paths filter dead reference 反 pattern (task #61
  P0-W1 PR #1926 first-application demo + Lesson #15 file-move 3-step verify
  升级到 v2 4-step grep .github/workflows/*.yml paths 同步)
- Lesson #17: backend 收敛 contract 优于上层 fork (simple-stable + private-deploy
  paramount directive earayu2 msg=1224bec8 在 cross-adapter contract 设计时
  应用; task #69 P0-B + task #70 P1 候选 1 cross-PR 一次性收敛 first-application)

跨 PR 多独立 source 同源 catch trail:
- Lesson #12 v9: Bryce msg=23a2f514 + Weston msg=86e05a8e 双独立 source
- Lesson #16: chenyexuan msg=f298011e + 冬柏 msg=3e93bb64 双独立 source
- Lesson #17: cuiwenbo msg=cedc7703 + Bryce msg=9895a148 双独立 source
- Lesson #13 v3 application demo 2: huangheng msg=bf785b12 + Planetegg
  msg=c63acbf5 + Weston msg=1e6b0838 三独立 source

per architect msg=c4cdf634 + msg=daaeeab5 + msg=03c892e0 sediment dispatch.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant